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ABSTRACT 


Formulas  are  derived  for  finding  the  areas  between  each 
pair  of  points  under  the  second-degree-polynomial  curve  defined 
by  three  equispaced  points  in  an  x-y  (cartesian)  coordinate 
system.  These  formulas  are  a  modification  of  Simpson's  numerical- 
integration  rule  which  gives  only  the  total  area  lying  under  the 
curve  between  the  initial  and  final  points.  The  formulas, 
implemented  by  a  Fortran  computer  subroutine  named  SIMCUM,  are 
useful  in  problems  where  it  is  necessary  to  find  integrals  under 
a  curve  defined  by  a  limited  number  Of  data  points,  and  the 
cumulative  Integral  is  desired  at  each  data  point  rather  than 
at  every  second  data  point  as  would  be  possible  with  the  ordinary 
form  of  Simpson's  rule.  With  a  fixed  number  of  data  points,  the 
method  gives  improved  accuracy,  compared  with  the  alternative  of 
using  the  trapezoidal  rule,  when  the  "true"  curve  is  continuous, 
not  a  straight  line,  and  is  reasonably  well  defined  by  the  data 
points.  For  a  specified  integration  accuracy,  a  considerable 
cost  saving  can  often  be  effected  by  using  this  method,  instead 
of  the  trapezoidal  rule  with  a  considerably  greater  number  of 
data  points. 

PROBLEM  STATUS 

This  is  a  final  report  on  one  phase  of  the  problem;  work 
is  continuing. 


AUTHORIZATION 

NRL  Problem  R02-55 
Project  RF-151-402-4011 
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A  MODIFIED  SIMPSON'S  RULE  AND  FORTRAN  SUBROUTINE  FOR  CUMULATIVE 
NUMERICAL  INTEGRATION  OF  A  FUNCTION  DEFINED  BY  DATA  POINTS 

INTRODUCTION 

In  many  scientific  and  engineering  problems,  it  is  desired 
to  perform  cumulative  numerical  integration  of  a  function  defined 
by  a  limited  number  of  data  points.  In  optical  ray  tracing,  for 
example,  the  data  points  might  be  the  index  of  refraction  at 
various  levels  in  a  layered  medium,  or  other  quantities  from 
which  the  refractive  index  could  be  computed.  A  similar  problem 
is  to  compute  the  cumulative  power  absorption  along  a  ray  path. 
(This  was  in  fact  the  problem  that  led  the  author  to  develop  the 
method  to  be  described.)*  The  limitation  on  the  number  of  points 
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It  is  not  known  whether  the  method  is  new,  but  it  is  not  de¬ 
scribed  in  any  textB  readily  available  to  the  author.  Jon  Wilson 
of  the  NRL  Radar  Division  Analysis  Staff,  has  pointed  out  that 
formulas  given  in  the  book  "Numerical  Integration"  by  P.  J.  Davis 
and  P.  Rabinowitz  (Blaisdell;  Waltham,  Mass.,  1967;  pp.  22-23) 
are  similar  in  principle  to  the  formulas  derived  here,  but  they 
are  not  there  developed  to  yield  the  explicit  formulas  given  here 
Their  formulas  are  also  generalized  to  apply  to  nonequally  spaced 
abscissa  values,  which  are  discussed  later  in  this  report. 


s 


may  be  due  to  the  cost  of  obtaining  experimental  data  at  smaller 
intervals,  or  it  may  equally  well  be  that  the  points  are  computed 
by  a  lengthy  procedure,  so  that  obtaining  additional  points 
would  require  an  excessive  amount  of  computer  time. 

The  general  nature  of  the  problem  in  abstract  mathematical 
terms  is:  Given  a  function  y(x)  defined  by  n  data  points  y(x1), 
y(xg) ,  ...  y(xn)  with  equally  spaced  abscissa  (x)  values, 
evaluate: 

xi 

li  ■  J  y(x)  dx,  i  -  2,  3,  4,  ...  n.  (1) 

The  simple  way  to  do  this  is  to  use  the  trapezoidal  rule 

of  numerical  integration  and  apply  the  recursion  formula: 

xi  xi-x 

J  y(x)  dx  s  J  y(x)  dx  +  (  ^  [y(xi)  +  y(xi-x]  (2) 

As  is  well  known,  Simpson's  rule  gives  a  much  more  accurate 
value  of  the  Integral  than  does  the  trapezoidal  rule  if  a 
continuous  smooth  curve  through  the  data  points  represents  the 
"true"  function  and  if  this  curve  is  not  a  straight  line.  The 
trouble  with  Simpson's  rule  in  this  application  is  that  since 
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It  requires  triplets  of  points  rather  than  pairs  of  points  for 

xi 

each  cumulative  step  in  the  integration,  values  of  J  y(x)  dx 

can  be  obtained  only  for  i  ■  3,  5,  7,  9,  ...  etc.  This  limitation 
is  an  objection  to  the  use  of  Simpson's  rule  when  the  available 
number  of  data  points  is  so  limited  that  obtaining  the  cumulative 
integral  only  for  odd-numbered  data  points  is  unacceptable. 

The  purpose  of  this  report  is  to  describe  a  modification  of 
Simpson's  rule  which  overcomes  this  objection  and  permits  the 
cumulative  Integral  to  be  found  at  each  data  point. 

METHOD 

Simpson's  rule  is  basically  a  formula  for  numerically 
finding  the  area  under  the  parabolic  (second-degree  polynomial) 
curve  defined  by  just  three  points  in  a  plane,  the  points  being 
separated  by  equal  intervals  along  the  x-axis  of  an  x-y  (cartesian) 
coordinate  system.  Such  a  set  of  points  and  the  corresponding 
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curve  are  depicted  in  Fig.  1.  The  are*  under  this  curve,  by  the 


xa  »ya 


Fig.  1.  Triplet  of  equiapaced  point*  in  xy  coordinate  system; 
second-degree-polynomial  curve  shown  dotted. 

ordinary  Simpson's  rule.  Is: 

*3 

A  -  J  y(x)  dx  -  ^  (yx  +  4ya  +  y3)  (3) 

where  y(x)  -  axa  +  bx  +  c,  with  a,  b,  and  c  chosen  to  make  y(x) 
fit  the  given  data  points.  The  quantity  h  Is  the  separation  of 
successive  points  on  the  x-axls;  i.e.,  h»x3-xa-xa-xi. 

Since  x1 ,  Xb  ,  and  x8  do  not  appear  explicitly  on  the  right- 
hand  side  of  Eq.  (3),  it  is  permissible  in  analysis  of  the  matter 
to  set  xl  ■  -h,  -  0,  and  +h.  In  these  terms,  the  problem 
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Chat  has  been  posed  is  Co  find  the  formulas,  in  terms  of  h, 


yi  ,  y2 ,  and  y3 ,  that  give 

o 

Ai  ■  1  y(x)  dx 

-h 


and 


h 


y(x)  dx. 


The  formulas  are  found  to  be: 


(4) 


(5) 


*i  "  |  (  |  yi  +  2ya  -  y  y3>  (6> 

Aa  -  |  (-  l  yi  +  2ya  +  £  ys>  (7) 

Of  course,  Aj  +  Ag  -A,  and  it  is  evident  that  adding  the  right- 
hand  sides  of -Eqs.  (6)  and  (7)  gives  the  usual  Simpson's  rule, 

Eq.  (3). 

DERIVATION  OF  THE  FORMULAS 

If  the  curve  in  Fig.  1  is  represented  by 

y(x)  ■  axa  +  bx  +  c  (8) 

and  (x1 ,  yx),  (Xg  ,  y3),  and  (Xg ,  y3)  are  points  on  the  curve. 


then 


(9) 

(10) 
(ID 


a  *  (yx  -  2y a  +  y3)/2h3 

b  -  (y3  -  Yi)/2h 

C  -  y2 

Eq.  (4)  in  these  terms  becomes: 
r°  r 

Ax  -  Jh  y(x)  dx-[^-  +  £2L.  +cx]  h-^.-^+ch  (12) 

Substituting  the  results  of  Eqs.  (9),  (10),  and  (11)  in  Eq.  (12) 
gives  Eq.  (6).  A  similar  procedure  starting  with  Eq.  (5)  in¬ 
stead  of  Eq.  (4)  yields  Eq.  (7). 

FORTRAN  IMPLEMENTATION  OF  THE  METHOD 

Using  these  results,  the  recursion  formulas  now  appli¬ 
cable  instead  of  the  trapezoidal  formula,  Eq.  (2),  are 


Xt  Xi.j 


y(x)  dx  - 

t 

y(x) 

dx  +  |  [fyi-i  +  2n  -  iyi+i  J 

(13) 

and 

xi-h 

xi 

r  1 

f  y(x)  dx  - 

J 

y(x) 

dx  +  |  -  i  yi-x  +  2yt  +  |  yt+x 

(14) 

*i 

J  L  J 

in  which  i  is  an 

even 

integer  and  h  ■  x^  -  x^.j  «  Xi+j  -  xj 
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The  initial  (least  value)  of  x  is  denoted  by  x1 .  Given  a  set 
of  points  (xi,  yi)  defined  by  Fortran  arrays  X(I)  and  Y(I), 
these  recursion  formulas  may  be  incorporated  into  a  "modified 


Simpson's  rule"  cumulative  integration  subroutine  as  follows. 


SUBROUTINE  SIMCUM  (Y ,H,N .AREA) 

DIMENSION  Y (N) ,AREA(N) 

AREA(1)«0. 

FACr-H/3. 

TERMl«i.25*Y(l) 

TERM2—  0.25*Y(1) 

K-l 

D0  1  1-3, N,2 
J-I-l 

TERM3  -  2,*Y(J) 

TERM4-1.25*Y(I) 

TERM5— 0.25*Y(I) 

AREA(J)-  AREA(K)  *  FAC* (TERM  +  TERH3  +  TERMS) 

AREA(I)  -  AREA(J)  +  FAC* (TERM 2  +  TERM 3  +  TERM4) 

TERM1-TERM4 

TERM  2-TERMS 

K-I 

1  CONTINUE 

IF  (2*(N/2) .EQ.N)  3,4 

3  AREA (N) -AREA (K)  +  FAC*(-0.25*Y(J)  +  2.*Y(K)  +  1.25*Y(N)) 

4  RETURN 
END 


The  parameter  Y  is  the  array  of  values  of  y,  H  is  the 
Interval  between  successive  values  of  X,  AREA  is  the  computed 
array  of  successive  (cumulative)  values  of  the  Integral,  and  N 
is  the  Integer  number  of  values  of  y  in  the  array.  Of  course, 
Y  and  AREA  must  be  properly  dimensioned  in  the  program  or 
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subroutine  from  which  SIMCUM  is  called.  Statement  3  provides 
for  calculation  of  the  integral  through  the  leftover  last 
interval  if  N  happens  to  be  an  even  number. 

UNEQUALLY  SPACED  DATA  POINTS 

These  formulas  and  the  subroutine  are  based  on  the  assumption 
that  the  data  points  are  equally  spaced  along  the  independent- 
variable  (x)  axis.  A  virtue  of  the  trapezoidal  rule  is  that  it 
can  also  accommodate  unequally  spaced  points.  Actually,  a 
further  modification  of  Simpson's  rule  can  be  made  to  apply  to 
this  case  also.  The  formulas  are  quite  cumbersome,  and  the 
author  has  not  derived  them  in  detail.  They  can  be  derived  by 
the  procedure  illustrated  by  Eq.  (12),  except  that  the  limits  of 
integration  are  now  Xj  and  Xg ;  an  equation  analogous  to  Eq.  (6) 
is  thus  found.  Then  the  limits  are  changed  to  x,  and  Xg ,  to 
obtain  an  equation  analogous  to  Eq.  (7).  The  cumbersomeness 
arises  from  the  fact  that  the  expressions  for  a,  b,  and  c  of 
Eq.  (8)  are  much  more  complicated  when  the  intervals  are  unequal. 

In  most  practical  cases,  data  points  are  obtained  at  equal  Intervals 
along  the  x-axis,  so  that  it  did  not  seem  worthwhile  to  derive  the 
complicated  formulas  for  unequal  intervals. 
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It  may  sometimes  happen  that  a  set  of  data  points  has 
spacings  that  are  constant  over  subintervals  but  not  over  the 
entire  interval  of  the  x-axis.  In  such  a  case,  of  course, 
Subroutine  SIMCUM  can  be  successively  applied  to  each  subinterval. 

A  TEST  OF  THE  FORTRAN  SUBROUTINE 

Subroutine  SIMCUM  was  tested  by  first  defining  an  array  of 

values  of  y(x)  -  sin  (x)  for  10  equal  subintervals  of  x  over  the 

total  Interval  it/ 2.  The  "correct"  result  was  found  by  using  the 

b 

analytic  Integration  formula  J  sin  (x)  dx  »  cos  (a)  -  cos  (b) . 

a 

The  cumulative  Integral  was  found  by  using  SIMCUM  and  the  result 
was  compared  with  that  of  trapezoidal -rule  Integration  (Eq.  (2)), 
for  each  subinterval.  The  computations  were  done  using  the  NRL 
CDC-3800  computer.  The  results  are  shown  in  the  following  table: 


-  9  - 


Sub interval 

Correct 

Result 

SIMCUM 

Result 

Trapezoidal 

Rule  Result 

1 

0.012312 

0.012337 

0.012286 

2 

0.048944 

0.048944 

0.048843 

3 

0.108994 

0.109016 

0.108769 

4 

0.190983 

0.190984 

0.190590 

5 

0.292893 

0.292912 

0.292291 

6 

0.412215 

0.412216 

0.411367 

7 

0.546010 

0.546023 

0.544886 

8 

0.690983 

0.690985 

0.689562 

9 

0.843566 

0.843572 

0.841830 

10 

1.000000 

1.000003 

0.997943 

Thus  the  cumulative  error  using  SDiCUM  is  seen  to  be  3  x  10-6 
compared  to  approximately  2  x  10"3  using  trapezoidal-rule  inte¬ 
gration  with  the  same  data  points.  Of  course,  this  is  a  special 
case,  but  it  illustrates  the  superiority  of  Simpson's  rule  when 
the  number  of  data  points  is  limited,  if  the  "true"  function  is 
a  smooth  curve.  A  test  was  sIbo  made  using  the  same  function 
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divided  into  9  subintervals,  so  that  there  were  10  data  points 
(an  even  number).  This  causes  Statement  3  of  Subroutine  SIMCUM 
to  be  utilized.  The  results  were  comparable  in  accuracy  to 
those  tabulated  above. 

A  subroutine  named  SIMRUN  is  available  in  the  CDC  CO-OP 
Program  Library  which  does  a  Simpson's  Rule  integration  similar 
to  that  of  SIMCUM  except  that  it  finds  the  cumulative  integral 
from  Eq.  (3),  and  is  therefore  limited  to  obtaining  values  at 
only  the  odd-numbered  points.  In  terms  of  the  test  of  SIMCUM 
tabulated  above,  this  corresponds  to  the  even-numbered  intervals. 
The  test  described  was  also  done  using  SIMRUN,  and  (as  expected) 
the  results  were  identical  to  those  obtained  with  SIMCUM  at  the 
even-numbered  intervals. 

APPLICABILITY  OF  THE  METHOD 

The  method  of  cumulative  integration  described  is  applicable 
when  the  following  criteria  are  met: 

(1)  The  function  y(x)  to  be  Integrated  is  defined  by  a 
discrete  set  of  data  points  (equally  spaced  along  the  x-axls 
of  the  coordinate  system)  rather  than  by  a  continuous  function; 
or,  it  is  defined  by  a  "function"  that  is  so  complicated  that  a 
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lengthy  computation  is  required  to  calculate  a  single  nusserical 
value  of  y. 

(2)  The  function  represented  by  the  data  points,  though  not 
necessarily  defined  analytically,  ia  known  to  be  a  function  that 
varies  slowly  relative  to  the  x-axis  separation  of  the  independent- 
variable  values,  i.e.,  the  data  points  define  the  function  fairly 
well  in  the  sense  that  a  second-degree  polynomial  is  a  reasonable 
interpolating  function  for  any  three  successive  points. 

(3)  The  function  is  not  a  straight  line,  and  the  desired 
accuracy  is  better  than  that  obtained  by  straight-line  inter¬ 
connection  of  the  successive  points  (trapezoidal  rule)  . 

(4)  The  cost  of  the  increase  in  computing  time  resulting 
from  the  use  of  Subroutine  SIMCUM  instead  of  a  trapezoidal-rule 
subroutine  is  probably  small  compared  to  the  cost  of  obtaining 
a  greatly  increased  density  of  data  points  (and  also  therefore 
performing  a  gr<*  iter  number  of  trapezoidal-rule  computations) .  A 
means  of  evaluating  this  criterion  is  difficult  to  specify  in 
general  terms,  but  in  most  specific  cases  the  answer  is  likely  to 
be  obvious. 

The  method  definitely  should  not  be  used  when  only  the 
integral  o/er  the  total  interval,  rather  than  the  cumulative  set 
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of  integrals,  is  needed.  In  that  case,  the  "ordinary"  form  of 
Simpson's  rule  should  be  used,  although  a  problem  is  created 
when  the  number  of  data  points  is  even  instead  of  odd.  This 
problem  can  be  solved  by  using  the  three  last  statements  of 
Subroutine  SIMCUM  at  the  end  of  the  subroutine.  A  Simpson's 
rule  subroutine  incorporating  this  feature  is  given  in 
Appendix  A.  When  the  data  points  are  known  to  define  a 
succession  of  straight-line  segments,  or  if  it  is  desired  to 
interpret  the  data  as  if  they  do,  the  trapezoidal  rule,  Eq.  (2), 
should  be  used. 

Formal  documentation  of  Subroutine  SIMCUM  has  beau  sub¬ 
mitted  to  the  NRL  Research  Computation  Center  for  possible 
Issuance  as  a  Computer  Note  or  Computer  Bulletin. 
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APPENDIX  A 


FORTRAN  SUBROUTINES  FOR  TRAPEZOIDAL- RULE  AND  "ORDINARY" 
_  SIMPSON'S  RUIZ  INTEGRATION 


For  convenient  reference,  •  listing  of  the  trapesoidal- 
rule  subroutine  written  by  the  author  for  use  in  comparing  that 
rule  with  the  modified  Simpson's  rule  is  given  below.  The 
parameters  are  Identical  to  those  of  Subroutine  SIMCUM. 


SUBROUTINE  TRAPZ0ID  (Y.H.N.AREA) 
DIMENSION  Y(N)  ,AREA(N) 

AREA(l)  -  0. 

H2  -  H*.5 
DO  1  1-2  ,N 
J  -  1-1 

1  AREA(I)  -  AREA(J)  +  H2*(Y(I)  +  Y(J)) 
END 


Also  given  below  for  convenience  is  a  subroutine  using  the 
conventional  Simpson' a  rule  in  a  form  suitable  for  finding  the 
overall  integral  of  a  set  of  data  points,  rather  than  the  set  of 
cumulative  Integrals.  AREA  in  this  subroutine  is  not  dimensioned. 
Provision  Is  made  for  inclusion  of  the  left-over  last  subinterval 
when  the  total  manber  of  data  points  is  an  even  number  by  using 
the  formula  of  Eq.  (14). 


SUBR0UTINE  SIMS0N  (Y,H,N .AREA) 

DIMENSION  Y(N) 

SUM  -  0. 

J-l 

D0  1  1-3, N,2 

SUM-SUM  +  Y  (J)  +  4.*Y(I-1)  +  Y(I) 

J-I 

1  CONTINUE 

IF  (2*(N/2).EQ.N)  3,4 

3  SUM  -  SUM  -  0.25*Y(N-2)  +  2.*Y(J)  +  1.25*Y(N) 

4  AREA  -  SUM  *  H/3. 

END 


This  subroutine  was  tested 


by  integrating 


\  sin  (x)  dx, 


o 

with  N  -  10  (ao  that  Statement  3  was  utilized) .  The  result 
obtained  was  AREA  -  0.999998460.  (As  noted  previously,  the 
exactly  correct  result  is  1.0.) 
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Please  make  the  following  correction: 

1.  Page  7,  in  the  Fortran  listing  of  subroutine  SIMCUM, 
13th  line,  the  first  asterisk  should  be  a  +  sign.  The  state¬ 
ment  should  read: 

AREA(J)  =  AREA®  +  FAC*(TERM1  +  TERM3  +  TERM5) 


